Abstract

Prior to computing the Cholesky factorization of a sparse, symmetric
positive definite matrix, a reordering of the rows and columns is computed
so as to reduce both the number of fill elements in the Cholesky factor and the
number of arithmetic operations (FLOPs) in the numerical factorization.
These two metrics are clearly related, and yet it is suspected that
the two underlying problems are different. However, no rigorous theoretical
treatment of the relation between these two problems seems to have been given yet.
In this paper we show by means of an explicit, scalable construction that
the two problems are different in a very strict sense. In our construction
no ordering that is optimal for the fill is optimal with respect to the
number of FLOPs, and vice versa.

Further, it is commonly believed that minimizing the number of FLOPs
is no easier than minimizing the fill (in the complexity sense), but so far
no proof appears to be known. We give a reduction chain that shows the
NP hardness of minimizing the number of arithmetic operations in the
Cholesky factorization.

1 Introduction 


Let $A \in \mathbb{R}^{n \times n}$ be a sparse, real, symmetric positive definite (SPD) matrix
and consider the Cholesky factorization of $A$ with symmetric pivoting, that is,
$PAP^T = LL^T$, where $L$ is a lower triangular matrix and $P$ is a permutation
matrix. Assuming no accidental cancellation, the nonzero pattern of $L + L^T$
depends solely on the choice of $P$ and contains the nonzero pattern of $A$. Nonzero
elements of $L$ at positions that are structural zeros in $A$ are called fill elements.
Determining a permutation matrix $P$ such that the number of these fill elements
is minimum is an NP hard problem [22]. Since the arithmetic work in
terms of floating point operations for the computation of the Cholesky factor $L$
is solely determined by the permutation matrix $P$ as well, one may wonder how
the number of fill elements and arithmetic work are related. In this paper we
study this relationship and give an NP hardness result for the minimization of
the arithmetic work.
Gaussian elimination for symmetric matrices is very conveniently described
in terms of undirected graphs. For example, the Cholesky factorization of $A$ can
be seen as an embedding of the graph $G(A)$ of $A$ into a triangulated supergraph
$G^+$ of $G$, meaning that all cycles of length at least four in $G^+$ have a chord
(also known as chordal or rigid circuit graphs). In this work we adhere to this
graph theoretic language. Useful references for the terminology and concepts
we assume in this paper are [20] and [12].

Let $G = (V, E)$ be a simple undirected graph with $n$ vertices. The MinimumFill
problem asks for a set of edges $F \subseteq V \times V$ of minimum cardinality
such that $G^+ = (V, E \cup F)$ is chordal. Chordal graphs are also characterized by
the existence of a perfect elimination ordering (PEO).

Let $\alpha : V \to \{1, \dots, n\}$ be a PEO for $G^+ = (V, E \cup F)$. When carrying out
vertex elimination on $G^+$ according to $\alpha$, denote by $d(\alpha^{-1}(i))$ the degree of the
$i$-th vertex in the course of the elimination process (the elimination degree of
$\alpha^{-1}(i)$). Minimizing the quantity

$$\operatorname{nnz}(\alpha) = \sum_{i=1}^{n} \bigl(d(\alpha^{-1}(i)) + 1\bigr)$$

over all triangulations $F$ is equivalent to minimizing $|F|$, since $\operatorname{nnz}(\alpha) = |E| + |F| + n$.
It is important to note that $\operatorname{nnz}(\cdot)$ does not depend on the particular
PEO chosen; it is a quantity that depends solely on $F$.

If $G$ is the graph of a sparse symmetric positive definite matrix $A$, then
$\operatorname{nnz}(\alpha)$ is the number of nonzero elements in the Cholesky factor of $A$ when
carrying out the factorization in the ordering $\alpha$. Another metric of interest
in this context is the number of floating point operations (FLOPs) that are
required for the computation of the Cholesky factor in the given ordering $\alpha$.
If we account for all additive, multiplicative and square-root operations for the
computation of the Cholesky factor, the total number of such FLOPs is given by

$$\operatorname{flop}(\alpha) = \sum_{i=1}^{n} \bigl(d(\alpha^{-1}(i)) + 1\bigr)^2 .$$

Again this metric does not depend on the particular PEO chosen [20, 7], and
minimizing $\operatorname{flop}(\alpha)$ over all triangulations of $G$ is the MinimumFLOPs problem.
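The two metrics can be made concrete with a small sketch of the elimination process. The code below is an illustration under our own assumptions, not part of the paper's method: the function names `eliminate`, `nnz` and `flop` and the adjacency-dictionary representation are hypothetical, and the elimination is run directly on $G$ for an arbitrary ordering, so that the fill edges are created on the fly.

```python
from itertools import combinations

def eliminate(adj, order):
    """Eliminate vertices in the given order and return their elimination degrees.

    adj maps each vertex to the set of its neighbours (illustrative representation).
    """
    g = {v: set(nb) for v, nb in adj.items()}  # work on a copy
    degrees = []
    for v in order:
        nbrs = g.pop(v)
        degrees.append(len(nbrs))
        for u in nbrs:
            g[u].discard(v)
        # The remaining neighbours of v become a clique; new edges are fill edges.
        for a, b in combinations(nbrs, 2):
            g[a].add(b)
            g[b].add(a)
    return degrees

def nnz(degrees):
    """Sum of (elimination degree + 1)."""
    return sum(d + 1 for d in degrees)

def flop(degrees):
    """Sum of (elimination degree + 1) squared."""
    return sum((d + 1) ** 2 for d in degrees)

# A cycle on four vertices; any ordering creates exactly one fill edge.
cycle4 = {"a": {"b", "d"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"a", "c"}}
degs = eliminate(cycle4, ["a", "b", "c", "d"])
print(degs, nnz(degs), flop(degs))
```

For the four-cycle the nonzero count equals $|E| + |F| + n = 4 + 1 + 4$, in line with the identity stated above.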

The MinimumFLOPs problem has received much less attention in the lit- 
erature than the MinimumFill problem. It is occasionally noted that the two 
metrics are related (e.g. [10, §7], [19, ch. 59]) and it is occasionally noted that 
the two problems are believed to be different (e.g. [20, sec. 4.1.2]). However, a 
rigorous investigation of the relation of these two problems seems to be missing 
in the published literature. 

In section 2 we discuss a class of graphs, parameterized by the number 
of vertices, for which all optimal orderings with respect to either one metric 
are strictly suboptimal for the other. A third ordering problem to which we 
relate these findings is the Treewidth problem. In the context of multifrontal 
methods [6, 16], this problem asks for an elimination ordering such that the 
largest front size is minimum [4]. It is also a parameter in the lower bound for
the amount of communication in the parallel sparse Cholesky factorization,
since it determines the size of a largest dense submatrix that has to be factorized.
Finally, we briefly discuss ordering heuristics from the viewpoint of the minimum
FLOPs problem.

In section 3 we give a formal NP hardness result for MinimumFLOPs. While 
it is well known that minimizing the fill is NP hard [22] and one expects that 
minimizing the number of arithmetic operations is no less difficult, it seems that 
such a proof has not been given before. 

1.1 Notation 

We use the following notation throughout this paper. The disjoint union of two
sets $P, Q$ is denoted by $P \mathbin{\dot\cup} Q$. For a graph $G = (V, E)$ and a vertex $v \in V$, we
denote by $N_G(v) \subseteq V$ the neighborhood of $v$ in $G$, that is, the vertices adjacent
to $v$. The closed neighborhood of $v$ is $N_G[v] := N_G(v) \cup \{v\}$. Denote vertex
degree and closed vertex degree of $v$ by $d_G(v) = |N_G(v)|$ and $d_G[v] = |N_G[v]|$,
respectively. We omit the reference to the graph $G$ in the notation whenever the
context permits. Sometimes we explicitly refer to the vertex and edge sets of a
graph $G$ by $V(G)$ and $E(G)$. A graph parameter of interest is the size of the
largest clique in a given graph $G$ (the clique number), which we denote by $\omega(G)$.
Finally, for a natural number $k \in \mathbb{N}$ we abbreviate the set $\{1, \dots, k\} =: [k]$.
Using this notation we formally restate the two problems of interest as decision
problems (recall that $d(\alpha^{-1}(i))$ refers to the elimination degree and notice that
$d(\alpha^{-1}(i)) + 1 = d[\alpha^{-1}(i)]$).



MinimumFill

Instance: Graph $G = (V, E)$, $n = |V|$, $k \in \mathbb{N}$

Question: Exists a set of edges $F \subseteq V \times V$ such that $(V, E \cup F)$
has a PEO $\alpha : V \to [n]$ with $\sum_{i=1}^{n} d[\alpha^{-1}(i)] \le k$?



MinimumFLOPs

Instance: Graph $G = (V, E)$, $n = |V|$, $k \in \mathbb{N}$

Question: Exists a set of edges $F \subseteq V \times V$ such that $(V, E \cup F)$
has a PEO $\alpha : V \to [n]$ with $\sum_{i=1}^{n} d[\alpha^{-1}(i)]^2 \le k$?

2 Minimum fill and minimum FLOPs are different

In this section we present a class of graphs for which minimizing fill and mini- 
mizing FLOPs are different problems. Interestingly, a structurally similar class 
of graphs is used in [14, p. 14] to show that MinimumFill and Treewidth 
are different. The treewidth problem is yet another NP hard problem [3] that 
can be formulated as an embedding problem into the class of chordal graphs. 

Treewidth

Instance: Graph $G = (V, E)$, $n = |V|$, $k \in \mathbb{N}$

Question: Exists a set of edges $F \subseteq V \times V$ such that $G^+ = (V, E \cup F)$
is chordal and has $\omega(G^+) \le k$?



Note that for a chordal graph $G$ with PEO $\alpha$ we have $\omega(G) = \max_i d[\alpha^{-1}(i)]$.

Figure 1: The graph $G(l,t,c)$



We will show that MinimumFill, MinimumFLOPs and Treewidth are 
different problems in a very strict sense. In section 2.1 we explore all minimal 
triangulations of a parameterized class of graphs (again, see [12] for an overview 
of the terminology). Using specific values for the parameters in section 2.2, we 
show that minima for the three optimization problems are attained at distinct 
triangulations. Finally, in section 2.3 we discuss the minimum FLOPs problem 
from the viewpoint of ordering heuristics. 

2.1 An instructive class of graphs 

In this section we study a class of graphs whose set of minimal triangulations is 
sufficiently simple to analyse and yet general enough to show that the extrema 
of minimum fill and minimum FLOPs are attained at different triangulations. 
In [14, p. 14] it is pointed out that MinimumFill and Treewidth are different 
problems using graphs from this class. In that monograph the author refers to 
an unpublished report for the details. Our study covers this aspect as well. 
A useful reference for all facts and results on minimal triangulations which we 
assume here is the survey [12]. 

For numbers $4 < l, t, c \in \mathbb{N}$ the graph we want to study is $G := C_l \times (S_t \mathbin{\dot\cup} K_c)$,
where $C_l$ is a cycle on $l$ vertices, $S_t$ is an independent set on $t$ vertices and $K_c$
is the complete graph on $c$ vertices (see Fig. 1). First we will characterize all
minimal triangulations of $G$. In fact only two types of triangulations exist; they
are shown in Fig. 2.

Proposition 2.1. The graph $G := C_l \times (S_t \mathbin{\dot\cup} K_c)$ has exactly two types of
minimal triangulations, $T_1 = K_l \times (S_t \mathbin{\dot\cup} K_c)$ and $T_2 = C_l^+ \times K_{t+c}$, where $C_l^+$ is
a minimal triangulation of $C_l$.

Figure 2: The two types of triangulations of $G(l,t,c)$, $T_1$ and $T_2$. Gray edges
are fill edges.

Proof. It is easy to verify that $T_1$ and $T_2$ are indeed chordal graphs, since
corresponding PEOs are readily constructed. Let $T$ be a minimal triangulation
of $G$. Then there exists a minimal elimination ordering (MEO) $\alpha : V(G) \to [l + t + c]$
for $G$ whose resulting filled graph is $T$. Let $v = \alpha^{-1}(1)$ be the first vertex
to be eliminated and denote the graph arising from eliminating $v$ by $G^+$. We
distinguish three cases:

Case 1: $v \in V(K_c)$. Then $G^+ = K_l \times (S_t \mathbin{\dot\cup} K_{c-1})$, which is a chordal
graph. Since $\alpha$ is a MEO for $G$, $\{\alpha^{-1}(2), \dots, \alpha^{-1}(n)\}$ is a PEO for $G^+$ and so
$T = T_1$.

Case 2: $v \in V(S_t)$. Then $G^+ = K_l \times (S_{t-1} \mathbin{\dot\cup} K_c)$, which is a chordal graph.
Since $\alpha$ is a MEO for $G$, $\{\alpha^{-1}(2), \dots, \alpha^{-1}(n)\}$ is a PEO for $G^+$ and so $T = T_1$.

Case 3: $v \in V(C_l)$. Then $G^+ = C_{l-1} \times K_{t+c}$. In this graph the only
chordless cycle of length at least four can possibly be $C_{l-1}$. So the minimal
triangulations of $G^+$ are now given by the minimal triangulations of $C_{l-1}$, which
implies that $T = T_2$.

It remains to show that $T_1$ and $T_2$ are minimal. We do so by showing that
in both triangulations every fill edge is the unique chord of some four-cycle
in $T_1$, $T_2$. For $T_1$ consider any fill edge $f = (c_i, c_j)$ in $V(C_l) \times V(C_l)$ and
$s \in V(S_t)$, $v \in V(K_c)$. Then $(s, c_i, v, c_j, s)$ is a four-cycle in $T_1$ whose unique
chord is $f$. For $T_2$ let $f = (s, v)$ be a fill edge with $s \in V(S_t)$, $v \in V(K_c)$ and
$c_1, c_2$ two non-adjacent vertices in $T_2$. Then $(c_1, s, c_2, v, c_1)$ is a four-cycle in $T_2$
whose unique chord is $f$. □

We will now determine the elimination degree sequence of certain PEOs 
for the triangulations T\ , T 2 and count the number of nonzero elements in the 
corresponding Cholesky factors as well as the number of FLOPs necessary to 
compute them. 

A PEO $\alpha_1$ for $T_1$ is given by ordering the $t$ vertices of $S_t$ first, followed
by any ordering of the remaining complete graph of size $l + c$. For the degree
sequence we obtain:

$$\{d(\alpha_1^{-1}(i))\}_{i=1}^{l+t+c} = \{l\}_{i=1}^{t} \cup \{l + c - j\}_{j=1}^{l+c}.$$

Given that degree sequence, the nonzero count, FLOP count and clique number for
the Cholesky factor corresponding to $T_1$ are given by

$$\operatorname{nnz}(\alpha_1) = \sum_{j} \bigl(d(\alpha_1^{-1}(j)) + 1\bigr) = t(l + 1) + \sum_{j=1}^{l+c} j \tag{1}$$

$$\operatorname{flop}(\alpha_1) = \sum_{j} \bigl(d(\alpha_1^{-1}(j)) + 1\bigr)^2 = t(l + 1)^2 + \sum_{j=1}^{l+c} j^2 \tag{2}$$

$$\omega(\alpha_1) = \max_i d(\alpha_1^{-1}(i)) + 1 = l + c \tag{3}$$

Another PEO for $T_1$ is obtained by ordering the vertices of $K_c$ first, followed
by the vertices of $S_t$ and finally the vertices of $K_l$. Of course, the expressions
(1)-(3) are the same for all PEOs.

A PEO $\alpha_2$ for the triangulation $T_2$ is obtained by the first $l - 2$ vertices of a
PEO for $C_l^+$ followed by an arbitrary ordering of the vertices of the remaining
$K_{t+c+2}$. Noting that for every PEO of $C_l^+$ the elimination degree of the first
$l - 2$ vertices is $t + c + 2$, we obtain the degree sequence

$$\{d(\alpha_2^{-1}(i))\}_{i=1}^{l+t+c} = \{t + c + 2\}_{i=1}^{l-2} \cup \{t + c + 2 - j\}_{j=1}^{t+c+2}.$$

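The resulting nonzero count, FLOP count and clique number are:

$$\operatorname{nnz}(\alpha_2) = \sum_{j} \bigl(d(\alpha_2^{-1}(j)) + 1\bigr) = (l - 2)(t + c + 3) + \sum_{j=1}^{t+c+2} j \tag{4}$$

$$\operatorname{flop}(\alpha_2) = \sum_{j} \bigl(d(\alpha_2^{-1}(j)) + 1\bigr)^2 = (l - 2)(t + c + 3)^2 + \sum_{j=1}^{t+c+2} j^2 \tag{5}$$

$$\omega(\alpha_2) = \max_i d(\alpha_2^{-1}(i)) + 1 = t + c + 3 \tag{6}$$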
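The closed forms (1)-(6) can be cross-checked by simulating the elimination on $G(l,t,c)$ itself. The following is a sketch under our own assumptions: the helper names (`build_graph`, `eliminate`, `metrics`, `closed_forms`) and the vertex labelling are hypothetical, and the orderings are applied directly to $G$ so that the fill of $T_1$ and $T_2$ arises during the elimination.

```python
from itertools import combinations

def build_graph(l, t, c):
    """Join of the cycle C_l with the disjoint union of S_t and K_c."""
    cyc = [("C", i) for i in range(l)]
    ind = [("S", i) for i in range(t)]
    clq = [("K", i) for i in range(c)]
    adj = {v: set() for v in cyc + ind + clq}

    def add(u, v):
        adj[u].add(v)
        adj[v].add(u)

    for i in range(l):
        add(cyc[i], cyc[(i + 1) % l])      # cycle edges
    for u, v in combinations(clq, 2):
        add(u, v)                          # clique edges
    for u in cyc:
        for v in ind + clq:
            add(u, v)                      # join edges
    return adj, cyc, ind, clq

def eliminate(adj, order):
    """Return the elimination degrees for the given ordering."""
    g = {v: set(nb) for v, nb in adj.items()}
    degrees = []
    for v in order:
        nbrs = g.pop(v)
        degrees.append(len(nbrs))
        for u in nbrs:
            g[u].discard(v)
        for a, b in combinations(nbrs, 2):
            g[a].add(b)
            g[b].add(a)
    return degrees

def metrics(adj, order):
    """(nonzero count, FLOP count, clique number) of the filled graph."""
    degs = eliminate(adj, order)
    return (sum(d + 1 for d in degs), sum((d + 1) ** 2 for d in degs), max(degs) + 1)

def closed_forms(l, t, c):
    """Evaluate (1)-(3) and (4)-(6)."""
    tri = lambda m: m * (m + 1) // 2
    sq = lambda m: m * (m + 1) * (2 * m + 1) // 6
    m = t + c + 2
    t1 = (t * (l + 1) + tri(l + c), t * (l + 1) ** 2 + sq(l + c), l + c)
    t2 = ((l - 2) * (m + 1) + tri(m), (l - 2) * (m + 1) ** 2 + sq(m), m + 1)
    return t1, t2

adj, cyc, ind, clq = build_graph(5, 5, 5)
order_1 = ind + cyc + clq   # independent set first, as for alpha_1
order_2 = cyc + ind + clq   # cycle vertices first, as for alpha_2
print(metrics(adj, order_1), metrics(adj, order_2), closed_forms(5, 5, 5))
```

The parameter values $l = t = c = 5$ are chosen only to keep the example small.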

2.2 Minimizing FLOPs, fill and treewidth are different 
problems 

Let $64 < n \in \mathbb{N}$ and set $l = 8n$, $t = 5n$, $c = 4n$ and consider the class of graphs
from section 2.1 with these parameters. We will count the number of nonzeros
and FLOPs for the two triangulations. Using (1)-(3) and (4)-(6) we obtain

$$
\begin{aligned}
\operatorname{nnz}(\alpha_1) &= 112n^2 + O(n), & \operatorname{nnz}(\alpha_2) &= \tfrac{225}{2}n^2 + O(n),\\
\operatorname{flop}(\alpha_1) &= 896n^3 + O(n^2), & \operatorname{flop}(\alpha_2) &= 891n^3 + O(n^2),\\
\omega(\alpha_1) &= 12n, & \omega(\alpha_2) &= 9n + 3,
\end{aligned}
$$

and it is readily verified that the omitted lower order terms are dominated by
the leading terms if $n > 64$. So for this choice of values for $l, t, c$, we see that $\alpha_1$
yields the optimal triangulation for the fill, but not for the number of FLOPs
or the size of the largest clique. The latter two metrics are minimized by $\alpha_2$,
which is suboptimal for the fill.

If the values $l = 2n + 3$, $t = n$, $c = 2n$, $n > 3$, are chosen, one obtains
the class of graphs from Kloks' example [14, p. 14]. In that case $\alpha_1$ minimizes
both the fill and the number of FLOPs, but not the size of the largest clique.
The minimum clique size is attained by $\alpha_2$, which is suboptimal for the fill and
FLOPs:

$$
\begin{aligned}
\operatorname{nnz}(\alpha_1) &= 10n^2 + O(n), & \operatorname{nnz}(\alpha_2) &= \tfrac{21}{2}n^2 + O(n),\\
\operatorname{flop}(\alpha_1) &= \tfrac{76}{3}n^3 + O(n^2), & \operatorname{flop}(\alpha_2) &= 27n^3 + O(n^2),\\
\omega(\alpha_1) &= 4n + 3, & \omega(\alpha_2) &= 3n + 3.
\end{aligned}
$$

Theorem 2.2. The three chordal graph embedding problems MinimumFill,
MinimumFLOPs and Treewidth are different in the sense that no two such
metrics can be minimized simultaneously in general.

The values for the parameters l,t, c have been chosen so that the resulting 
edge- and FLOP counts are easy to calculate. There exist other values for l,t 
and c so that the differences between the two triangulations are a little bit larger. 



2.3 Minimum FLOPs and heuristics 

The minimum degree (MD) heuristic and its variations (e.g. AMD [1], MMD [15]) 
are a popular class of ordering heuristics commonly used to reduce the number
of fill elements in the Cholesky factor. These heuristics use the elimination
degree of the vertices as their primary local criterion for ordering the vertices. 
Note that this criterion is in fact the canonical local criterion for minimizing 
the FLOPs and not the fill in, in which context MD type heuristics are usually 
put. 

The canonical criterion for locally minimizing the number of fill elements 
is the deficiency of a vertex, which accounts for the number of fill edges the 
elimination of the vertex would imply. It has been observed [17, 21] that using
this criterion (or approximations of it) instead of the elimination degree usually
results in fewer arithmetic operations (and less fill). In fact, the authors of [21] regard reducing
the number of FLOPs as their primary objective for their experiments with the 
deficiency criterion. 
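The two local criteria can be contrasted in a small greedy sketch. This is a plain, unoptimized illustration under our own assumptions and is not how AMD or MMD are implemented; the names `greedy_order`, `degree` and `deficiency` are hypothetical, and ties are broken by the smallest vertex label.

```python
from itertools import combinations

def degree(g, v):
    """Elimination degree of v in the current elimination graph."""
    return len(g[v])

def deficiency(g, v):
    """Number of fill edges that eliminating v would create."""
    return sum(1 for a, b in combinations(g[v], 2) if b not in g[a])

def greedy_order(adj, score):
    """Repeatedly eliminate a vertex of minimum score; return (order, degrees)."""
    g = {v: set(nb) for v, nb in adj.items()}
    order, degrees = [], []
    while g:
        v = min(sorted(g), key=lambda x: score(g, x))  # ties: smallest label
        nbrs = g.pop(v)
        order.append(v)
        degrees.append(len(nbrs))
        for u in nbrs:
            g[u].discard(v)
        for a, b in combinations(nbrs, 2):
            g[a].add(b)
            g[b].add(a)
    return order, degrees

# A path on five vertices: both criteria pick leaves and create no fill.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
order_md, degs_md = greedy_order(path, degree)
order_def, degs_def = greedy_order(path, deficiency)
print(order_md, degs_md, order_def, degs_def)
```

Swapping the `score` argument is the only difference between the two greedy variants, which makes it easy to compare the resulting fill and FLOP counts on other small graphs.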

Reported experimental results for ordering heuristics like the ones above cer- 
tainly have contributed to the common understanding that reducing the number 
of fill elements usually goes hand in hand with reducing the number of arithmetic
operations and vice versa. While this behaviour is typically observed
when ordering heuristics are benchmarked, it is worth pointing out that it may
happen in practice that an ordering that produces less fill than another
ordering causes significantly more FLOPs (or vice versa).

To confirm this we conducted a very simple experiment. We computed 
the ordering statistics for 1130 pattern symmetric matrices from the Univer- 
sity of Florida sparse matrix collection [5] using AMD (2.3.0) [2] and METIS 
(4.0.3) [13]. For 91 of these matrices one heuristic produced fewer fill elements
than the other while performing worse with respect to the FLOP count at the
same time. For example, for the matrix 1644 from the UF collection, a structural
problem, AMD produces about 2% fewer fill elements than METIS while
requiring approximately 22% more arithmetic operations.

Figure 3: Linear and quadratic arrangement of a graph. The quadratic cost
function here is $f(x) = x^2$, so this is an instance of OQA(0). The cost of the
linear arrangement is 5, while the quadratic cost is 27.



3 Minimizing FLOPs is NP hard 

We now show that minimizing the FLOP count in sparse Cholesky factorization 
is indeed an NP hard problem. To do so, we reduce the MaxCut problem to a 
certain class of quadratic arrangement problems in section 3.1. In section 3.2 we 
reduce such a quadratic arrangement problem to the minimum FLOPs problem 
via a quadratic variation of the bipartite chain graph completion problem. 

3.1 Quadratic vertex arrangement problems 

In the optimal linear arrangement problem [9, GT42], we are given a graph $G = (V, E)$
and are asked to arrange the vertices of $G$ at positive integer positions
on the real line such that the sum of the implied edge lengths is minimum:



OptimalLinearArrangement (OLA)

Instance: Graph $G = (V, E)$ on $n$ vertices, $k \in \mathbb{N}$

Question: Exists a bijection $\alpha : V \to [n]$ such that $\sum_{(u,v) \in E} |\alpha(u) - \alpha(v)| \le k$?



OLA is also known as MinimumOneSum (M1S) and minimizes the 1-norm
of a vector of distances implied by the linear arrangement of the vertices of the 
graph. Other norms have been considered; for the 2-norm (MinimumTwoSum, 
M2S) and the infinity norm (Bandwidth) the corresponding arrangement prob- 
lems are known to be NP hard [18, 11]. In contrast to these arrangement prob- 
lems, the class of arrangement problems we discuss here cannot be expressed in 
terms of a p-norm of the distance vector. 

Instead of laying out the vertices of $G$ at equally spaced positions, we consider
certain quadratically spaced positions (see Fig. 3). We call this the
OptimalQuadraticArrangement(c) (OQA(c)) problem. Let

$$c = c_2 X^2 + c_1 X + c_0, \qquad c_0, c_1, c_2 \in \mathbb{N} \tag{7}$$

be a polynomial of degree at most 2 with non-negative integer coefficients. We
regard $c$ as a parameter for the function

$$f : [n] \to \mathbb{Z}_+, \quad x \mapsto x^2 + c(n)x.$$

Then the positions on the real line at which we place the vertices of $G$ are given
by $f([n])$. Notice that $f$ is injective, hence a bijection onto $f([n])$. Allowing for a minor abuse of notation
we will sometimes write $c$ instead of $c(n)$ when it can be seen from the context
whether the integer $c(n)$ or the polynomial $c$ is referred to. Formally we define
the following class of decision problems, parametrized by the polynomial $c$:

OptimalQuadraticArrangement(c) (OQA(c))

Instance: Graph $G = (V, E)$ on $n$ vertices, $k \in \mathbb{N}$

Question: Is there a bijection $\alpha : V \to [n]$ such that
$\sum_{(u,v) \in E} |f(\alpha(u)) - f(\alpha(v))| = \sum_{(u,v) \in E} |\alpha(u)^2 - \alpha(v)^2 + c(\alpha(u) - \alpha(v))| \le k$?

For example, when $c$ is the zero polynomial, this includes the problem where
the vertex positions are laid out according to the mapping $x \mapsto x^2$. In section
3.1.2 we will prove that OQA(c) is NP hard for every choice of the polynomial
$c$ in (7).
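To make the two cost functions tangible, the following sketch evaluates the linear and the quadratic arrangement cost of a given bijection. It is an illustration only: the function names and the representation of the polynomial $c$ as a coefficient triple `(c2, c1, c0)` are our own assumptions.

```python
def poly_c(coeffs, n):
    """Evaluate c = c2*X^2 + c1*X + c0 at X = n."""
    c2, c1, c0 = coeffs
    return c2 * n * n + c1 * n + c0

def linear_cost(edges, alpha):
    """Sum of |alpha(u) - alpha(v)| over all edges."""
    return sum(abs(alpha[u] - alpha[v]) for u, v in edges)

def quadratic_cost(edges, alpha, coeffs=(0, 0, 0)):
    """Sum of |f(alpha(u)) - f(alpha(v))| with f(x) = x^2 + c(n) x."""
    cn = poly_c(coeffs, len(alpha))

    def f(x):
        return x * x + cn * x

    return sum(abs(f(alpha[u]) - f(alpha[v])) for u, v in edges)

# A path u - v - w arranged at positions 1, 2, 3.
edges = [("u", "v"), ("v", "w")]
alpha = {"u": 1, "v": 2, "w": 3}
print(linear_cost(edges, alpha))                 # equally spaced positions
print(quadratic_cost(edges, alpha))              # positions 1, 4, 9
print(quadratic_cost(edges, alpha, (0, 0, 1)))   # positions 2, 6, 12
```

Here `len(alpha)` plays the role of $n$, so the sketch assumes that `alpha` assigns a position to every vertex of the graph.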

3.1.1 Basic properties of the OQA problem 

We will now discuss a few properties of the OQA problem and introduce some
useful notation for later use. Given a graph $G = (V, E)$ on $n$ vertices, a bijection
$\alpha : V \to N \subset \mathbb{Z}_+$ and the quadratic function $f(x) = x^2 + c(n)x$, we denote the
quadratic cost of such an arrangement by

$$q(\alpha) := \sum_{(u,v) \in E} |f(\alpha(u)) - f(\alpha(v))|,$$

and the corresponding linear cost for the arrangement by

$$l(\alpha) := \sum_{(u,v) \in E} |\alpha(u) - \alpha(v)|.$$

For an edge $e = (u, v) \in E$, we sometimes write its implied quadratic cost under
the ordering $\alpha$ as

$$\phi_\alpha(e) = |f(\alpha(u)) - f(\alpha(v))|,$$

where we may drop the index $\alpha$ if the ordering is implied by the context.

Definition 3.1. For a given ordering $\alpha : V \to [n]$ and a non-negative integer
$r$, we denote by $\alpha + r$ the following translated ordering:

$$\alpha + r : V \to \{1 + r, \dots, n + r\}, \quad v \mapsto \alpha(v) + r.$$

Translated orderings are actually not consistent with the definitions of the
arrangement problems (we required $\alpha$ to map onto $[n]$). Our breach of correctness
is harmless for the way we employ translated orderings.

The linear arrangement cost is translation invariant, since

$$l(\alpha + r) = \sum_{(u,v) \in E} |(\alpha(u) + r) - (\alpha(v) + r)| = \sum_{(u,v) \in E} |\alpha(u) - \alpha(v)| = l(\alpha),$$

but the quadratic arrangement costs of the two orderings are different; a translation
results in a linear change of the arrangement cost:

Lemma 3.2 (translation lemma). For an ordering $\alpha : V \to [n]$ and a displacement
$r \in \mathbb{N}$ we have

$$q(\alpha + r) = q(\alpha) + 2r\,l(\alpha).$$

Proof. We can assume that in the given ordering $\alpha$, we have for an edge $(u, v) \in E$
that $\alpha(u) < \alpha(v)$ (otherwise call this undirected edge $(v, u)$ instead).

$$
\begin{aligned}
q(\alpha + r) &= \sum_{(u,v) \in E} \Bigl( \bigl((\alpha(v) + r)^2 + c(\alpha(v) + r)\bigr) - \bigl((\alpha(u) + r)^2 + c(\alpha(u) + r)\bigr) \Bigr)\\
&= \sum_{(u,v) \in E} \bigl( \alpha(v)^2 + 2\alpha(v)r - \alpha(u)^2 - 2\alpha(u)r + c\,\alpha(v) - c\,\alpha(u) \bigr)\\
&= q(\alpha) + 2r\,l(\alpha).
\end{aligned}
$$

□

Denote by $K_s$ the complete graph on $s$ vertices. Both the quadratic and
linear costs for arranging $K_s$ are independent of the chosen bijection $\alpha$.
Elementary counting immediately gives that the linear arrangement cost of $K_s$ is
$\tfrac{1}{6}s(s^2 - 1)$. The quadratic cost is given by the following lemma.

Lemma 3.3. Let $\alpha : V(K_s) \to [s]$ be an arrangement of $K_s$. Then we have

$$q(\alpha) = \tfrac{1}{6}s(s^2 - 1)(c + s + 1).$$

Proof. We say that the vertices $u, v$ have ordering distance $d$, $1 \le d \le s - 1$, if
$|\alpha(u) - \alpha(v)| = d$. The cost implied by all edges between vertices of ordering
distance $d$ is

$$
\begin{aligned}
\sum_{k=1}^{s-d} \bigl((k + d)^2 + c(k + d) - k^2 - ck\bigr) &= 2d \sum_{k=1}^{s-d} k + d(d + c)(s - d)\\
&= d(s - d)(s - d + 1) + d(d + c)(s - d) = d(c + s + 1)(s - d).
\end{aligned}
\tag{8}
$$

The total cost of $\alpha$ is the sum of (8) over all the distances $1 \le d \le s - 1$, so we
find

$$
\begin{aligned}
q(\alpha) &= \sum_{d=1}^{s-1} d(c + s + 1)(s - d) = (c + s + 1)\Bigl(s \sum_{d=1}^{s-1} d - \sum_{d=1}^{s-1} d^2\Bigr)\\
&= (c + s + 1)\bigl(\tfrac{1}{2}s^2(s - 1) - \tfrac{1}{6}s(s - 1)(2s - 1)\bigr)\\
&= \tfrac{1}{6}s(s^2 - 1)(c + s + 1).
\end{aligned}
$$

□



Lemma 3.4. Let $\alpha : V(K_s) \to [s]$ be an arrangement of $K_s$ and $r \in \mathbb{Z}_+$. Then
we have

$$q(\alpha + r) = \tfrac{1}{6}s(s^2 - 1)(2r + c + s + 1).$$

Proof. Direct application of the translation lemma to Lemma 3.3. □
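The statements about the complete graph and about translated orderings are easy to confirm by brute force for small $s$. The following sketch is only a sanity check under our own assumptions: the function names are hypothetical, and `c` stands for the integer value $c(n)$.

```python
from itertools import combinations, permutations

def costs(edges, alpha, c):
    """Return (linear cost, quadratic cost) for f(x) = x^2 + c*x."""
    def f(x):
        return x * x + c * x

    lin = sum(abs(alpha[u] - alpha[v]) for u, v in edges)
    quad = sum(abs(f(alpha[u]) - f(alpha[v])) for u, v in edges)
    return lin, quad

def check_complete_graph(s, c, r):
    """Check the cost formulas for K_s over all arrangements and a shift by r."""
    verts = range(s)
    edges = list(combinations(verts, 2))
    for perm in permutations(range(1, s + 1)):
        alpha = dict(zip(verts, perm))
        shifted = {v: p + r for v, p in alpha.items()}
        lin, quad = costs(edges, alpha, c)
        _, quad_shifted = costs(edges, shifted, c)
        assert 6 * lin == s * (s * s - 1)                        # linear cost of K_s
        assert 6 * quad == s * (s * s - 1) * (c + s + 1)         # Lemma 3.3
        assert quad_shifted == quad + 2 * r * lin                # Lemma 3.2
        assert 6 * quad_shifted == s * (s * s - 1) * (2 * r + c + s + 1)  # Lemma 3.4
    return True

all_ok = all(check_complete_graph(s, c, r)
             for s in range(2, 6) for c in range(4) for r in range(4))
print(all_ok)
```

The formulas are multiplied by 6 so that the check stays in integer arithmetic.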

It is easy to see that the OQA problem is different from the OLA problem 
in the same sense as MinimumFill and MinimumFLOPs are different (see 
Appendix A). 
3.1.2 OQA(c) is NP hard

We will now show that OQA(c) is an NP hard problem for every choice of the 
polynomial c in (7). Our strategy to reduce from MaxCut follows along the 
lines of the reduction from MaxCut to OLA in [8, chap. 8], but the details are 
very much different. 

The reduction will reduce MaxCut to the maximization version of OQA. 
Thus we show first that maximization and minimization of the quadratic ar- 
rangement are equivalent (in the complexity sense). 

Proposition 3.5. MaxOQA(c) and MinOQA(c) are equivalent.

Proof. Let $(G = (V, E); k)$, $|V| = n$, be an instance of MaxOQA(c) and define
$(\bar{G}, k' := \tfrac{1}{6}n(n^2 - 1)(c + n + 1) - k)$ to be an instance of MinOQA(c) ($\bar{G}$ is the
complement of $G$). Denote by $\bar{E}$ the set of edges of $\bar{G}$, then by Lemma 3.3 we
know that for any ordering $\alpha : V \to [n]$ we have

$$\sum_{e \in E} \phi(e) + \sum_{e \in \bar{E}} \phi(e) = \tfrac{1}{6}n(n^2 - 1)(c + n + 1) = k + k',$$

so

$$\sum_{e \in E} \phi(e) \ge k \iff \sum_{e \in \bar{E}} \phi(e) \le k',$$

which completes the proof. □
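The complement argument can be illustrated with a short check that enumerates all arrangements of a small graph. This is a sketch under our own assumptions; `complement_instance` and `quad_cost` are hypothetical names, vertices are labelled $0, \dots, n-1$, and `c` is the integer value $c(n)$.

```python
from itertools import combinations, permutations

def quad_cost(edges, alpha, c):
    """Quadratic arrangement cost for f(x) = x^2 + c*x."""
    def f(x):
        return x * x + c * x

    return sum(abs(f(alpha[u]) - f(alpha[v])) for u, v in edges)

def complement_instance(n, edges, k, c):
    """Map a maximization instance (G, k) to the minimization instance on the complement."""
    all_pairs = set(combinations(range(n), 2))
    comp_edges = sorted(all_pairs - {tuple(sorted(e)) for e in edges})
    k_prime = n * (n * n - 1) * (c + n + 1) // 6 - k
    return comp_edges, k_prime

n, c, k = 4, 1, 20
edges = [(0, 1), (1, 2), (2, 3)]
comp_edges, k_prime = complement_instance(n, edges, k, c)

# For every arrangement: cost on G is at least k iff cost on the complement is at most k'.
equivalent = all(
    (quad_cost(edges, dict(zip(range(n), perm)), c) >= k)
    == (quad_cost(comp_edges, dict(zip(range(n), perm)), c) <= k_prime)
    for perm in permutations(range(1, n + 1))
)
print(comp_edges, k_prime, equivalent)
```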

From now on, we only consider the maximization version of OQA(c).
If $G = (V, E)$ is a graph and $X \subseteq V$, we denote by $\delta(X)$ the edge cut

$$\{(u, v) \in E \mid u \in X \wedge v \in V \setminus X\}.$$

Sometimes we simply write $\bar{X}$ for $V \setminus X$. Deciding whether $G$ admits a cut of
size $k \in \mathbb{Z}_+$ or greater, the MaxCut problem, is a fundamental NP complete
problem.

We introduce the following notation that we will use in the next two lemmas
and the theorem that follows. Let $\alpha : V \to [n]$ be an arrangement for $G = (V, E)$.
For $1 \le j \le n$, we define the set

$$X_j = \{v \in V \mid \alpha(v) \le j\}.$$

The sets $X_j$ naturally induce cuts $\delta(X_j)$. In the reduction from MaxCut we
will need to rearrange isolated vertices in a given ordering. The following two
lemmas give sufficient conditions for performing these rearrangements without
decreasing the arrangement costs.

Lemma 3.6. Let $1 \le j \le n$ and $w \in V$ an isolated vertex such that $\alpha(w) < j$
and $|\delta(X_k)| \le |\delta(X_j)|$ for all $\alpha(w) \le k < j$. Then for the ordering $\alpha' : V \to [n]$
defined by

$$
\alpha'(v) = \begin{cases}
\alpha(v) & \text{if } \alpha(v) < \alpha(w) \text{ or } j < \alpha(v),\\
j & \text{if } v = w,\\
\alpha(v) - 1 & \text{if } \alpha(w) < \alpha(v) \le j,
\end{cases}
$$

we have $q(\alpha') \ge q(\alpha)$.

Figure 4: Illustration for the proof of Lemma 3.6



Proof. For an edge $e = (u, v) \in E$ we may assume that $\alpha(u) < \alpha(v)$. We denote
the contribution of an edge $e$ to the change of cost by $\Delta(e) := \phi_{\alpha'}(e) - \phi_\alpha(e)$.
Based on the positions of $u$ and $v$ in $\alpha$ relative to $\alpha(w)$ and $j$, we now calculate
$\Delta(e)$; there are six cases to be considered (see Fig. 4).

$$
\begin{aligned}
\alpha(u) < \alpha(v) < \alpha(w) &\;\Rightarrow\; \Delta(e) = 0\\
\alpha(u) < \alpha(w) \wedge j < \alpha(v) &\;\Rightarrow\; \Delta(e) = 0\\
\alpha(u) < \alpha(w) < \alpha(v) \le j &\;\Rightarrow\; \Delta(e) = -(2\alpha(v) + c - 1)\\
\alpha(w) < \alpha(u) < \alpha(v) \le j &\;\Rightarrow\; \Delta(e) = +(2\alpha(u) + c - 1) - (2\alpha(v) + c - 1)\\
\alpha(w) < \alpha(u) \le j < \alpha(v) &\;\Rightarrow\; \Delta(e) = +(2\alpha(u) + c - 1)\\
j < \alpha(u) < \alpha(v) &\;\Rightarrow\; \Delta(e) = 0
\end{aligned}
$$

We now quantify the global change of cost. For accounting purposes, it is
useful to associate a change of cost $\pm(2\alpha(x) + c - 1)$ with the vertex $x$ (all cost
changes are of that form). Notice that only vertices $x \in V$ with $\alpha(w) < \alpha(x) \le j$
can have a change of cost associated with them. Moving from the $j$-th position
in the arrangement to the left back to position $\alpha(w) + 1$, we pick up a positive
change at vertex $x$ if and only if $|\delta(X_{\alpha(x)})| > |\delta(X_{\alpha(x)-1})|$ and a negative change
if and only if $|\delta(X_{\alpha(x)})| < |\delta(X_{\alpha(x)-1})|$. If the size of the cut does not change
at $x$, neither does the cost change (changes may cancel at that vertex though).

Since none of the cuts on the left of $j$ exceeds the size of the cut $\delta(X_j)$ and
the absolute value of each change is strictly decreasing as we move to the left,
the sum of accumulated changes stays non-negative throughout until we reach
position $\alpha(w) + 1$. But by reaching that position we have accounted for all
changes due to the reordering, so we have $q(\alpha') \ge q(\alpha)$. □
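The rearrangement of Lemma 3.6 can be tried out on a toy example. The sketch below uses our own hypothetical helper names (`cut_size`, `lemma_3_6_applies`, `move_right`, `quad_cost`), takes `c` to be the integer value $c(n)$, and assumes the reading $X_j = \{v : \alpha(v) \le j\}$ of the cut sets.

```python
def quad_cost(edges, alpha, c=0):
    """Quadratic arrangement cost for f(x) = x^2 + c*x."""
    def f(x):
        return x * x + c * x

    return sum(abs(f(alpha[u]) - f(alpha[v])) for u, v in edges)

def cut_size(edges, alpha, j):
    """Number of edges with exactly one end point among the first j positions."""
    return sum(1 for u, v in edges if (alpha[u] <= j) != (alpha[v] <= j))

def lemma_3_6_applies(edges, alpha, w, j):
    """Check the hypotheses: w isolated, alpha(w) < j, no larger cut between alpha(w) and j."""
    isolated = all(w not in e for e in edges)
    target = cut_size(edges, alpha, j)
    return isolated and alpha[w] < j and all(
        cut_size(edges, alpha, k) <= target for k in range(alpha[w], j))

def move_right(alpha, w, j):
    """Move w to position j and shift the vertices in between one step to the left."""
    aw = alpha[w]
    new = {}
    for v, p in alpha.items():
        if v == w:
            new[v] = j
        elif aw < p <= j:
            new[v] = p - 1
        else:
            new[v] = p
    return new

edges = [("p", "r"), ("p", "s"), ("q", "s")]
alpha = {"w": 1, "p": 2, "q": 3, "r": 4, "s": 5}   # w is isolated
j = 3                                              # position of the largest cut
assert lemma_3_6_applies(edges, alpha, "w", j)
alpha_new = move_right(alpha, "w", j)
print(quad_cost(edges, alpha), quad_cost(edges, alpha_new))
```

In this example the isolated vertex moves into the cut of size three and the quadratic cost does not decrease, as the lemma predicts.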

Lemma 3.6 describes circumstances that allow moving a single isolated vertex
from the left into a locally largest cut without decreasing the arrangement costs.
Unfortunately, moving isolated vertices from the right of that cut is not as easy.
In fact the cost can decrease if we move such a single isolated vertex into a position
where it intersperses the cut (see Appendix B). But there are conditions under
which we can move a block of isolated vertices from the right, as the following
lemma shows.

Lemma 3.7. Let $j, s, f \in \mathbb{N}$ be such that $1 \le j < j + s < j + s + f \le n$,
$|\delta(X_j)| > |\delta(X_{j+k})|$ for $1 \le k \le s + f$ and $\{\alpha^{-1}(j + s + 1), \dots, \alpha^{-1}(j + s + f)\} \subseteq V$
are isolated vertices. Define the ordering $\alpha'$ by

$$
\alpha'(v) = \begin{cases}
\alpha(v) & \text{if } \alpha(v) \le j \text{ or } \alpha(v) > j + s + f,\\
\alpha(v) - s & \text{if } j + s < \alpha(v) \le j + s + f,\\
\alpha(v) + f & \text{if } j < \alpha(v) \le j + s.
\end{cases}
$$

If $j + 1 + \tfrac{f}{2} > |\delta(X_{j+s+f})|(s - 1)$ then we have $q(\alpha') \ge q(\alpha)$.

Figure 5: Illustration for the proof of Lemma 3.7. Edges symbolize the edge
classes $E_i$.

Proof. As in Lemma 3.6 we denote the change of cost when passing from $\alpha$ to
$\alpha'$ for an edge $e = (u, v) \in E$ by $\Delta(e)$ and we assume that $\alpha(u) < \alpha(v)$. Based
on the positions of the end points, the edges can be divided into six disjoint
sets (see Fig. 5):

$$
\begin{aligned}
E_1 &:= \{(u, v) \in E \mid \alpha(u) < \alpha(v) \le j\}\\
E_2 &:= \{(u, v) \in E \mid \alpha(u) \le j \wedge j + s + f < \alpha(v)\}\\
E_3 &:= \{(u, v) \in E \mid \alpha(u) \le j < \alpha(v) \le j + s\}\\
E_4 &:= \{(u, v) \in E \mid j < \alpha(u) < \alpha(v) \le j + s\}\\
E_5 &:= \{(u, v) \in E \mid j < \alpha(u) \le j + s < j + s + f < \alpha(v)\}\\
E_6 &:= \{(u, v) \in E \mid j + s + f < \alpha(u) < \alpha(v)\}
\end{aligned}
$$

From the definition of $\alpha'$, we see that $\Delta(e) = 0$ for $e \in E_1 \cup E_2 \cup E_6$. For the
other three cases a short calculation shows that

$$
\begin{aligned}
e \in E_3 &\;\Rightarrow\; \Delta(e) = f(2\alpha(v) + c + f),\\
e \in E_4 &\;\Rightarrow\; \Delta(e) = 2f(\alpha(v) - \alpha(u)) \text{ and}\\
e \in E_5 &\;\Rightarrow\; \Delta(e) = -f(2\alpha(u) + c + f).
\end{aligned}
$$

We now derive a lower bound for the cost difference of $\alpha'$ and $\alpha$. We will
use that

$$|E_3| - |E_5| = |E_3| + |E_2| - (|E_5| + |E_2|) = |\delta(X_j)| - |\delta(X_{j+s+f})| \ge 1,$$

as well as $|E_5| \le |\delta(X_{j+s+f})|$. We immediately drop the non-negative contribution
from edges in $E_4$ and calculate

$$
\begin{aligned}
q(\alpha') - q(\alpha) &\ge \sum_{e \in E_3} f(2\alpha(v) + c + f) - \sum_{e \in E_5} f(2\alpha(u) + c + f)\\
&\ge |E_3|\, f(2(j + 1) + c + f) - |E_5|\, f(2(j + s) + c + f)\\
&= (|E_3| - |E_5|)\, f(2(j + 1) + c + f) - |E_5|\, 2f(s - 1)\\
&\ge f(2(j + 1) + c + f) - |\delta(X_{j+s+f})|\, 2f(s - 1)\\
&= f\bigl(2(j + 1) + c + f - |\delta(X_{j+s+f})|\, 2(s - 1)\bigr).
\end{aligned}
$$

By assumption we have $j + 1 + \tfrac{f}{2} > |\delta(X_{j+s+f})|(s - 1)$, so the difference
$q(\alpha') - q(\alpha)$ is non-negative.

□
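The block move of Lemma 3.7 can likewise be tried on a toy instance. The sketch below relies on our own hypothetical helpers (`cut_size`, `lemma_3_7_applies`, `move_block`, `quad_cost`); `c` is the integer value $c(n)$, and the hypothesis $j + 1 + f/2 > |\delta(X_{j+s+f})|(s-1)$ is multiplied by two to stay in integer arithmetic.

```python
def quad_cost(edges, alpha, c=0):
    """Quadratic arrangement cost for positions x^2 + c*x."""
    def position(x):
        return x * x + c * x

    return sum(abs(position(alpha[u]) - position(alpha[v])) for u, v in edges)

def cut_size(edges, alpha, j):
    """Number of edges with exactly one end point among the first j positions."""
    return sum(1 for u, v in edges if (alpha[u] <= j) != (alpha[v] <= j))

def lemma_3_7_applies(edges, alpha, j, s, f):
    """Check the hypotheses of the block move for the given j, s, f."""
    n = len(alpha)
    touched = {x for e in edges for x in e}
    block_isolated = all(v not in touched
                         for v, p in alpha.items() if j + s < p <= j + s + f)
    cuts_smaller = all(cut_size(edges, alpha, j) > cut_size(edges, alpha, j + k)
                       for k in range(1, s + f + 1))
    bound = 2 * (j + 1) + f > 2 * cut_size(edges, alpha, j + s + f) * (s - 1)
    return 1 <= j < j + s < j + s + f <= n and block_isolated and cuts_smaller and bound

def move_block(alpha, j, s, f):
    """Swap the s vertices after position j with the f isolated vertices behind them."""
    new = {}
    for v, p in alpha.items():
        if j < p <= j + s:
            new[v] = p + f
        elif j + s < p <= j + s + f:
            new[v] = p - s
        else:
            new[v] = p
    return new

edges = [("a", "b")]
alpha = {"a": 1, "b": 2, "w1": 3, "w2": 4}   # w1, w2 are isolated
j, s, f = 1, 1, 2
assert lemma_3_7_applies(edges, alpha, j, s, f)
alpha_new = move_block(alpha, j, s, f)
print(quad_cost(edges, alpha), quad_cost(edges, alpha_new))
```

The single edge belongs to the class $E_3$, and its cost grows by $f(2\alpha(v) + c + f) = 2 \cdot (4 + 0 + 2) = 12$.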
Theorem 3.8. Let $c = c_2 X^2 + c_1 X + c_0$ be a polynomial of degree at most two
with non-negative integer coefficients. Then MaxCut $\propto$ OQA(c).

Proof. Let $(G' = (V', E'); k')$ be an instance of MaxCut with $n = |V'|$. We define an instance
$(G = (V, E); k)$ for OQA by adding $n^5$ isolated vertices to $G'$: Let $W$ be a set of
size $n^5$, then we set

$$V = V' \mathbin{\dot\cup} W, \quad E = E' \quad \text{and} \quad k = n^{10} k'.$$

Assume that $G'$ admits a cut $\delta(X')$ of size at least $k'$. We define an ordering
$\alpha : V \to [n + n^5]$ for $G$ by

$$
\begin{aligned}
\alpha(X') &= \{1, \dots, |X'|\}\\
\alpha(W) &= \{|X'| + 1, \dots, |X'| + n^5\}\\
\alpha(V' \setminus X') &= \{|X'| + n^5 + 1, \dots, n^5 + n\},
\end{aligned}
$$

where the ordering within the sets $X'$, $W$ and $V' \setminus X'$ is arbitrary. We now derive
a lower bound for $q(\alpha)$: Every edge $e \in \delta(X')$ induces a cost of at least

$$
\begin{aligned}
\phi(e) &\ge (n^5 + 2)^2 + c(n^5 + 2) - 1^2 - c \cdot 1\\
&= n^{10} + (4 + c)n^5 + c + 3,
\end{aligned}
$$

so

$$
\begin{aligned}
q(\alpha) = \sum_{e \in E} \phi(e) \ge \sum_{e \in \delta(X')} \phi(e) &\ge \bigl(n^{10} + (4 + c)n^5 + c + 3\bigr)\,|\delta(X')|\\
&\ge n^{10} k' = k.
\end{aligned}
$$
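The forward direction of the reduction can be replayed on a tiny MaxCut instance. The sketch below is an illustration under our own assumptions: the function names are hypothetical, the vertices of $G'$ are labelled $0, \dots, n-1$, the isolated vertices are tuples `("w", i)`, and `c` is the integer value of the polynomial for the constructed graph.

```python
def quad_cost(edges, alpha, c=0):
    """Quadratic arrangement cost for f(x) = x^2 + c*x."""
    def f(x):
        return x * x + c * x

    return sum(abs(f(alpha[u]) - f(alpha[v])) for u, v in edges)

def build_oqa_instance(n, edges, k_prime):
    """Add n^5 isolated vertices and scale the bound by n^10."""
    isolated = [("w", i) for i in range(n ** 5)]
    vertices = list(range(n)) + isolated
    return vertices, list(edges), n ** 10 * k_prime

def ordering_from_cut(n, x_side):
    """Place X' first, then all isolated vertices, then the rest of V'."""
    left = sorted(x_side)
    right = [v for v in range(n) if v not in x_side]
    isolated = [("w", i) for i in range(n ** 5)]
    return {v: pos for pos, v in enumerate(left + isolated + right, start=1)}

# Triangle with the cut X' = {0} of size 2.
n, edges, k_prime = 3, [(0, 1), (0, 2), (1, 2)], 2
vertices, oqa_edges, k = build_oqa_instance(n, edges, k_prime)
alpha = ordering_from_cut(n, {0})
cost = quad_cost(oqa_edges, alpha)
per_edge_bound = n ** 10 + 4 * n ** 5 + 3   # lower bound per cut edge for c = 0
print(len(vertices), k, cost, per_edge_bound)
```

For the triangle the two cut edges alone already contribute more than $n^{10} k'$, so the constructed arrangement certifies a yes-instance of the maximization version of OQA(0).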

For the reverse direction assume that we are given an ordering $\alpha : V \to [n + n^5]$
such that $q(\alpha) \ge k$. In order to show that $G'$ has a cut of size at least
$k'$, we will first rearrange $\alpha$, without decreasing the ordering cost, so that the
vertices in $W$ are ordered consecutively. This reordering process has two stages:
First, using Lemma 3.6, we will move isolated vertices to the right so that they
intersperse with locally largest cuts. This yields a block structure of isolated
vertices of $W$ to which we will then apply Lemma 3.7 in a second step.

For the first stage, let $b_1$ be the largest index of a maximum cut among the
cuts $\delta(X_i)$, that is,

$$b_1 = \max\{\operatorname*{argmax}_{1 \le i \le n^5 + n} |\delta(X_i)|\}.$$

Among the $b_1$ vertices in $X_{b_1}$ denote by $n_1$ the number of vertices from $V \setminus W$ and
by $f_1$ the number of vertices from $W$, so $b_1 = n_1 + f_1$. By Lemma 3.6, we can
rearrange $\alpha$ so that $\alpha^{-1}(\{1, \dots, n_1\}) \subseteq V \setminus W$ and $\alpha^{-1}(\{n_1 + 1, \dots, n_1 + f_1\}) \subseteq W$
without decreasing the cost.

Figure 6: Illustration for the block structure arising from moving isolated vertices
closest to their rightmost largest cut.

Iterating this procedure on the vertices ordered after $b_1$, we obtain an ordering
in which the vertices appear partitioned in $h$ parts, where in each part the
vertices of $V \setminus W$ and $W$ are ordered consecutively (see Fig. 6). More formally, the
obtained ordering has the following properties:

$$
\begin{aligned}
& 0 =: b_0 < b_1 < b_2 < \dots < b_h = n + n^5,\\
& b_k = \max\{\operatorname*{argmax}_{b_{k-1} < i \le n^5 + n} |\delta(X_i)|\}, && 1 \le k \le h,\\
& |\delta(X_{b_1})| > |\delta(X_{b_2})| > \dots > |\delta(X_{b_h})| = 0,\\
& n_k + f_k = b_k - b_{k-1}, && 1 \le k \le h,\\
& \alpha^{-1}(\{b_{k-1} + 1, \dots, b_{k-1} + n_k\}) \subseteq V \setminus W, && 1 \le k \le h,\\
& \alpha^{-1}(\{b_{k-1} + n_k + 1, \dots, b_{k-1} + n_k + f_k\}) \subseteq W, && 1 \le k \le h.
\end{aligned}
$$

Note that some of the $f_k$ may be zero while all $n_k > 0$. Since $|\delta(X_{b_1})|$ is
trivially bounded by the linear cutwidth of the complete graph on $n$ vertices
and the size of the cuts $\delta(X_{b_k})$ is strictly decreasing, we obtain $h \le \tfrac{n^2}{4}$.

Now begins the second stage of the rearrangement. From the given block
structure, we will perform a series of rearrangements using Lemma 3.7 until
eventually all vertices from $W$ intersperse between the sets $X_{b_1}$ and $\overline{X}_{b_1}$. Each
of the reordering operations will maintain the block structure as a whole, but
the individual values of the $f_k$ will change. In order to simplify notation, we
will not explicitly distinguish between the different orderings $\alpha$ and values $f_k$ at
the different stages during the process.

Let $v \in \operatorname{argmax}_{1 \le k \le h} f_k$; since $\sum_k f_k = n^5$ and $h \le \frac{n^2}{4}$, we have $f_v \ge 4n^3$.
Define $j := b_{v-1}$, $s := n_v$, $f := f_v$. By construction we have that $|\delta(X_j)| >
|\delta(X_{j+k})|$ for $1 \le k \le s + f$ and

\[ j + 1 + f > f \ge 4n^3 > \frac{n^2}{4}(n - 1) \ge |\delta(X_{j+s+f})|\,(s - 1). \]

So the assumptions of Lemma 3.7 are met and in the rearranged ordering we
now have $f_{v-1} \ge 4n^3$ and $f_v = 0$. By induction we obtain an ordering in which
the block structure satisfies $f_1 \ge 4n^3$ and $f_2 = \dots = f_v = 0$.

Next set $j := b_1 \ge 4n^3$, $s := \sum_{k=2}^{v}(n_k + f_k) + n_{v+1} = \sum_{k=2}^{v+1} n_k < n$ and
$f := f_{v+1}$. By construction we have that $|\delta(X_j)| > |\delta(X_{j+k})|$ for $1 \le k \le s + f$
and

\[ j + 1 + f > j \ge 4n^3 > \frac{n^2}{4}(n - 1) \ge |\delta(X_{j+s+f})|\,(s - 1). \]

This permits us to apply Lemma 3.7 and in the rearranged ordering we now
have $f_{v+1} = 0$ while $f_1 \ge 4n^3$ is maintained. By induction we arrive at an
ordering where $f_2 = \dots = f_h = 0$, which implies $f_1 = n^5$. Denote this final
ordering by $\alpha'$. Since none of the reordering operations has ever decreased the
total arrangement cost, we have $q(\alpha') \ge q(\alpha) \ge k$, where $\alpha$ is the very original
ordering that we started with.

Next we derive an upper bound for $q(\alpha')$. We classify the edges of $G$ in three
different categories and bound the contribution from each of these sources.

1. If $(u, v) \in X_1 \times X_1$, then the total cost of these edges is strictly bounded
by the arrangement cost of a clique of size $n$ being ordered at positions $1, \dots, n$.
By Lemma 3.3, this cost is $\frac{1}{6} n(n^2 - 1)(c + n + 1)$.

2. If $e = (u, v) \in X_1 \times \overline{X_1}$, then the cost implied by $e$ is at most
$(n^5 + n)^2 + c(n^5 + n) - 1^2 - c$.

3. If $(u, v) \in \overline{X_1} \times \overline{X_1}$, then the total cost of these edges is strictly bounded
by the arrangement cost of a clique of size $n$ being ordered at positions
$n^5 + 1, \dots, n^5 + n$. By Lemma 3.4, this cost is $\frac{1}{6} n(n^2 - 1)(2n^5 + c + n + 1)$.
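
The two clique costs quoted in items 1 and 3 can be checked by brute force. The following is a minimal sketch, assuming the edge cost used throughout this section, namely $g(\text{later position}) - g(\text{earlier position})$ with $g(x) = x^2 + cx$, and treating $c$ as a plain number. The function names and the sample values are illustrative and do not come from the paper.

```python
from itertools import combinations


def edge_cost(i, j, c):
    """Cost of one edge with endpoints at positions i and j: g(hi) - g(lo), g(x) = x^2 + c*x."""
    lo, hi = min(i, j), max(i, j)
    return (hi * hi + c * hi) - (lo * lo + c * lo)


def clique_cost(n, c, offset=0):
    """Brute-force arrangement cost of an n-clique at positions offset+1, ..., offset+n."""
    positions = range(offset + 1, offset + n + 1)
    return sum(edge_cost(i, j, c) for i, j in combinations(positions, 2))


def clique_cost_closed_form(n, c, offset=0):
    """n(n^2 - 1)(2*offset + c + n + 1) / 6; exact because n(n^2 - 1) is a multiple of 6."""
    return n * (n * n - 1) * (2 * offset + c + n + 1) // 6


n, c = 4, 3  # hypothetical sample values
print(clique_cost(n, c), clique_cost_closed_form(n, c))              # item 1: offset 0
print(clique_cost(n, c, n**5), clique_cost_closed_form(n, c, n**5))  # item 3: offset n^5
```

With `offset=0` the closed form reduces to the expression in item 1, and with `offset=n**5` to the expression in item 3.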

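In total we obtain

\begin{align*}
n^{10} k' = k &\le q(\alpha) \le q(\alpha') \\
&< |\delta(X_1)| \left((n^5 + n)^2 + c(n^5 + n) - 1 - c\right) + \frac{n}{6}(n^2 - 1)(c + n + 1) \\
&\quad + \frac{n}{6}(n^2 - 1)(2n^5 + c + n + 1) \\
&< |\delta(X_1)| \left(n^{10} + 2n^6 + cn^5 + n^2 + cn\right) + \frac{1}{3} n(n^2)(n^5 + c + n + 1), \\
\Longrightarrow \quad k' &< |\delta(X_1)| + \underbrace{|\delta(X_1)| \, \frac{2n^6 + cn^5 + n^2 + cn}{n^{10}} + \frac{n^3 (n^5 + c + n + 1)}{3 n^{10}}}_{=:\, r(n)}.
\end{align*}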



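Since $|\delta(X_1)| \le \frac{n^2}{4}$, we have

\[ r(n) \le \frac{1}{2n^2} + \frac{c}{4n^3} + \frac{1}{4n^6} + \frac{c}{4n^7} + \frac{1}{3n^2} + \frac{c}{3n^7} + \frac{1}{3n^6} + \frac{1}{3n^7}. \]

Because $c$ is a polynomial of degree at most two, there exists an integer $n_c \in \mathbb{N}$
such that

\[ r(n) < 1 \quad \text{for all } n \ge n_c. \]

Together with the integrality of $|\delta(X_1)|$ and $k'$, it follows that $|\delta(X_1)| \ge k'$. □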
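
To get a feeling for how small the threshold $n_c$ can be, the eight-term bound on $r(n)$ can be evaluated exactly. This is a sketch under an assumption: it takes $c(n) = 2(n^2 + 1)$, the polynomial used later in section 3.2.2, and the helper names are hypothetical. For this particular $c$ every term is decreasing in $n$, so the first $n$ with a bound below 1 is a valid choice of $n_c$.

```python
from fractions import Fraction


def c_poly(n):
    # Assumption: c(n) = 2(n^2 + 1), the special case considered in section 3.2.2.
    return 2 * (n * n + 1)


def r_upper_bound(n, c=c_poly):
    """Eight-term upper bound on r(n), obtained from |delta(X_1)| <= n^2 / 4."""
    cn = c(n)
    terms = [
        Fraction(1, 2 * n**2), Fraction(cn, 4 * n**3),
        Fraction(1, 4 * n**6), Fraction(cn, 4 * n**7),
        Fraction(1, 3 * n**2), Fraction(cn, 3 * n**7),
        Fraction(1, 3 * n**6), Fraction(1, 3 * n**7),
    ]
    return sum(terms)


def first_n_below_one(c=c_poly, limit=1000):
    """Smallest n <= limit whose bound is below 1, or None if there is none."""
    for n in range(1, limit + 1):
        if r_upper_bound(n, c) < 1:
            return n
    return None
```

Exact rational arithmetic is used so that the comparison with 1 is not affected by rounding.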



3.2 Reduction from OQA to the minimum FLOPs problem

In this section we reduce OQA(c) to the minimum FLOPs problem for a certain 
polynomial c. Our strategy follows the pattern that Yannakakis used for the 
reduction of OLA to minimum fill [22], but again the details differ considerably.
In particular we employ a quadratic variation of the bipartite chain graph
completion problem, which we discuss in section 3.2.1. In section 3.2.2 we give 
a reduction from OQA(c) to this quadratic chain completion problem. 



3.2.1 Reduction from bipartite quadratic chain completion 

Let $G = (P \cup Q, E)$ be a bipartite graph on $p + q$ vertices, $p := |P|$, $q := |Q|$.
Recall that for a vertex $v \in P$ we denote its neighbourhood in $G$ by $\mathcal{N}(v)$. $G$ is
a bipartite chain graph if there exists a bijection $\alpha : P \to [p]$ such that

\[ \mathcal{N}(\alpha^{-1}(i)) \supseteq \mathcal{N}(\alpha^{-1}(i + 1)), \quad 1 \le i \le p - 1. \tag{9} \]

Note that G admits such a chain ordering for P if and only if G admits a chain 
ordering for Q, so the definition does not depend on a particular partition of G. 
For a bipartite graph, the property of being a chain graph is hereditary and the 
minimal obstruction set is {2K2} [22, Lemma 1]. 
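
The nesting condition (9) can be tested directly: if a chain ordering exists, the neighbourhoods are totally ordered by inclusion, so sorting $P$ by decreasing degree must produce one. The sketch below assumes edges are given as pairs `(v, w)` with `v` in $P$; the function name is illustrative.

```python
def chain_ordering(P, edges):
    """Return a chain ordering of P as a list (largest neighbourhood first), or None."""
    nbr = {v: set() for v in P}
    for v, w in edges:
        nbr[v].add(w)
    # Nested neighbourhoods are sorted by size; ties must then be equal sets.
    order = sorted(P, key=lambda v: len(nbr[v]), reverse=True)
    for a, b in zip(order, order[1:]):
        if not nbr[a] >= nbr[b]:
            return None
    return order


# 2K2 (two disjoint edges) is the minimal obstruction, so no ordering exists.
print(chain_ordering(["a", "b"], [("a", "x"), ("b", "y")]))
# Nested neighbourhoods {x, y} and {x} form a chain graph.
print(chain_ordering(["a", "b"], [("a", "x"), ("a", "y"), ("b", "x")]))
```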

Yannakakis considers the problem of completing a given bipartite graph into 
a bipartite chain graph. We formulate the corresponding decision problem in 
terms of vertex degrees: 

BipartiteChainCompletion (BCC)

Instance: Bipartite graph $G = (P \cup Q, E)$, $k \in \mathbb{N}$

Question: Is there a set of edges $F \subseteq P \times Q$ such that $G^+ =
(P \cup Q, E \cup F)$ is a chain graph and $\sum_{v \in P} d_{G^+}(v) \le k$?

Note that our metric of measuring the cost of the chain completion is equivalent
to minimizing $|F|$ in the formulation above, because

\[ \sum_{v \in P} d_{G^+}(v) = |E| + |F|. \]

Our quadratic variation of the bipartite chain completion problem has a cost 
function which is a quadratic function of the vertex degrees in the augmented 
graph. 

QuadraticChainCompletion (QCC)

Instance: Bipartite graph $G = (P \cup Q, E)$ on $p + q$ vertices
where the partition $P$ is designated, $k \in \mathbb{N}$

Question: Does there exist a set of edges $F \subseteq P \times Q$ such that
$G^+ = (P \cup Q, E \cup F)$ is a chain graph with

\[ \operatorname{qcc}(F) := \sum_{v \in P} d_{G^+}(v)^2 + 2(p + 1) \sum_{v \in P} d_{G^+}(v) \le k \,? \]

Unlike for BCC, it is not clear whether the minima of our quadratic variation 
depend on the particular vertex partition chosen, which is why the information 
which partition to consider is part of the input. Of course, the particular cost 
value (defined by qcc) of a bipartite chain graph embedding depends on the 
partition (for example, consider the simple path on three vertices). 
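
The dependence on the designated partition can be made concrete with the path on three vertices mentioned above. The following sketch evaluates the qcc cost for both choices of the designated side; the function name and the vertex labels are illustrative, and edges are assumed to be given as pairs whose first entry lies in the designated side.

```python
def qcc(designated, edges):
    """Sum over the designated side of d(v)^2 + 2(p + 1) d(v), with p = |designated|."""
    p = len(designated)
    deg = {v: 0 for v in designated}
    for v, _ in edges:
        deg[v] += 1
    return sum(d * d + 2 * (p + 1) * d for d in deg.values())


# Path a - x - b; it is already a chain graph, so no edges are added.
cost_ab = qcc(["a", "b"], [("a", "x"), ("b", "x")])  # designated side {a, b}
cost_x = qcc(["x"], [("x", "a"), ("x", "b")])        # designated side {x}
print(cost_ab, cost_x)
```

The two values differ because both the degrees and the factor $2(p + 1)$ change with the designated side.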

The reduction from BCC to MinimumFill in [22] involves a construction 
that relates certain triangulations to chain embeddings, which we adapt to our 
needs. 

Definition 3.9. Let $G = (P \cup Q, E)$ be a bipartite graph on $p + q$ vertices and
$U = \{u_v \mid v \in P\}$ a set of $p$ vertices. We define the graph $C = C(G) = (V', E')$
by

\begin{align*}
V' &= P \cup Q \cup U, \\
E' &= E \cup (P \times P) \cup ((Q \cup U) \times (Q \cup U)) \cup \{(v, u_v) \mid v \in P\}.
\end{align*}

Further, for a given bijection $\alpha : P \to [p]$, we define the set

\[ G(\alpha) = \{(\alpha^{-1}(i), u_{\alpha^{-1}(j)}) \mid 1 \le i < j \le p\} \subseteq P \times U. \]
The set $U$ is what distinguishes our construction from Yannakakis' construction
$C(G)$. The next lemmas describe how chain completions of $G$ relate to
triangulations of $C(G)$ and $G(\alpha)$, giving an analogue of [22, Lemma 2].

Definition 3.10. Let $M$ be a set of $m$ elements and $\alpha : M \to [m]$ a bijection.
Then the reverse bijection $\alpha^R : M \to [m]$ is uniquely defined by the property
$\alpha^{-R}(i) := (\alpha^R)^{-1}(i) = \alpha^{-1}(m - i + 1)$ for $1 \le i \le m$.

For the following we recall that a minimal triangulation for a graph is an in- 
clusion minimal set of edges whose addition yields a chordal graph. Analogously 
we will speak of minimal chain completions for a given bipartite graph. There is 
no loss of generality if we assume that the decision problems from above are
restricted to minimal completions. Recall also that a PEO for a graph $G = (V, E)$
is a bijection $\alpha : V \to [n]$, $n = |V|$, such that eliminating vertices in the order
implied by $\alpha^{-1}$ does not cause any fill. By a prefix of a PEO $\alpha$ we mean a
restriction $\alpha|_W$ for some $W \subseteq V$ such that $\alpha^{-1}(k) = \alpha|_W^{-1}(k)$ for $1 \le k \le |W|$.

Lemma 3.11. Let $G = (P \cup Q, E)$ be a bipartite graph, $C = C(G) = (V', E') =
(P \cup Q \cup U, E')$ and $F' \subseteq V' \times V'$ a minimal triangulation of $C$. Set
$F'_U := F' \cap (P \times U)$, $F'_Q := F' \cap (P \times Q)$. Then there exists a bijection
$\alpha : P \to [p]$ such that

i) $F'_U = G(\alpha)$,

ii) $(P \cup Q, E \cup F'_Q)$ is a chain graph and admits $\alpha$ as a chain ordering for $P$.

Proof. Since $P$ and $Q \cup U$ are already cliques in $C$, we have $F' \subseteq P \times (Q \cup U)$,
so $F' = F'_U \cup F'_Q$ is a partitioning of $F'$. Since $F'$ is minimal, there exists a
PEO $\beta$ for $C^+$ such that the elimination graph of $C$ under $\beta$ equals $C^+$, and
because $Q \cup U$ is a clique in $C^+$, we can choose $\beta$ so that it orders $Q \cup U$ last
[20, Corollary 4], that is,

\[ \beta^{-1}(\{1, \dots, p\}) = P, \qquad \beta^{-1}(\{p + 1, \dots, 2p + q\}) = Q \cup U. \]

Denote by $N_j$ the neighborhood of the vertex $\beta^{-1}(j)$ in the reduced elimination
graph at step $j$ and by $F'_j$ the set of fill edges introduced at step $j$ that
are incident with $U$. We will show the following statement by induction (for
$1 \le j \le p$): In the $j$-th elimination step, we have

\begin{align*}
N_j \cap (P \cup U) &= \{\beta^{-1}(i) \mid j < i \le p\} \cup \{u_{\beta^{-1}(i)} \mid 1 \le i \le j\}, \\
F'_j &= \{(\beta^{-1}(i), u_{\beta^{-1}(j)}) \mid j < i \le p\}.
\end{align*}

By inspection of the graph $C$ we find that the statement is true for $j = 1$.
Next assume that the statement is true for all $k$ with $1 \le k < j$. By the
induction assumption, the fill edges incident with $U$ introduced up to step $j$ are

\[ \bigcup_{k=1}^{j-1} F'_k = \bigcup_{k=1}^{j-1} \{(\beta^{-1}(i), u_{\beta^{-1}(k)}) \mid k < i \le p\}. \tag{10} \]

So at the elimination step $j$, the set of vertices of $U$ that the vertex $\beta^{-1}(j) \in P$
is adjacent to because of any prior fill edge is $\{u_{\beta^{-1}(i)} \mid 1 \le i < j\}$, so we obtain

\[ N_j \cap (P \cup U) = \{\beta^{-1}(i) \mid j < i \le p\} \cup \{u_{\beta^{-1}(i)} \mid 1 \le i \le j\}. \]

Since the edges (10) are already present at step $j$, the only edges that need to
be added in order to turn this set of vertices into a clique are

\[ \{(\beta^{-1}(i), u_{\beta^{-1}(j)}) \mid j < i \le p\} = F'_j, \]

which completes the proof of the claim.

Let $\alpha := (\beta|_P)^R$. Noting that $F'_p = \emptyset$, it follows from the claim that

\begin{align*}
F' \cap (P \times U) = \bigcup_{k=1}^{p-1} F'_k &= \bigcup_{j=1}^{p-1} \{(\beta^{-1}(i), u_{\beta^{-1}(j)}) \mid j < i \le p\} \\
&= \{(\alpha^{-1}(i), u_{\alpha^{-1}(j)}) \mid 1 \le i < j \le p\} = G(\alpha).
\end{align*}

Now we have constructed $\alpha$ and shown (i). To show (ii), note that $P$ is a
clique in $C(G)$ and $\alpha^R = \beta|_P$ is also a prefix of a PEO for the induced subgraph
$C^+[P \cup Q]$. So by the construction of $C$, $\alpha$ is a chain ordering for $P$ in
$(P \cup Q, E \cup F'_Q)$. □

The previous lemma characterizes minimal triangulations of $C(G)$: they
decompose into a chain completion for $G$ and a set $G(\alpha)$ such that $\alpha$ is a
compatible chain ordering. The next two lemmas give a reverse direction, so
every triangulation of $C(G)$ uniquely defines a chain completion of $G$ and vice
versa.

Lemma 3.12 (chordal patching lemma, folklore). Let G = (V,E) be a graph 
where the vertices are partitioned in three sets V = A U B U C . Then G is 
chordal if the following three conditions are satisfied: 

1. G[V\ C] has two connected components A,B, 

2. G[C] is a clique, 

3. G[A U C] and G[B U C] are chordal.

Proof. Let Z be a simple cycle of length at least 4 in G. If Z is entirely contained 
in A U C or B U C, then Z has a chord. Otherwise, Z contains vertices both of 
A and B, so Z intersects C at least at two non-consecutive vertices of Z, which 
gives a chord in Z since C is a clique. □ 

Lemma 3.13. Let $G = (P \cup Q, E)$ be a bipartite graph and let $F \subseteq P \times Q$
such that $G^+ = (P \cup Q, E \cup F)$ admits $\alpha : P \to [p]$ as a chain ordering. Then
$F' = F \cup G(\alpha)$ is a triangulation for $C = C(G) = (V', E')$ and $\alpha^R$ is a prefix
of a PEO for $C^+ = (V', E' \cup F')$.

Proof. Let $C_Q^+ = C^+[P \cup Q]$ and $C_U^+ = C^+[P \cup U]$. We first show that $C_Q^+$
and $C_U^+$ are chordal. A chordless cycle in $C_Q^+$ implies an induced subgraph in
$G^+$ isomorphic to $2K_2$, which contradicts the assumption that $G^+$ is a bipartite
chain graph. So $C_Q^+$ is chordal.

From the definition of $G(\alpha)$ it follows that we can use $\alpha^R$ to carry out $p$ steps
of vertex elimination in $C_U^+$ without introducing a fill edge. But after these $p$
steps only a clique of size $p$ remains, so $C_U^+$ admits a PEO, which implies that
$C_U^+$ is chordal.

Noting that $P$ is a clique in $C^+$, it follows from Lemma 3.12 that $C^+$ is
chordal. Since $G^+$ is a chain graph and $P$ a clique in $C^+$, we also have that
eliminating along $\alpha^R$ in $C^+$ does not introduce a fill edge, so $\alpha^R$ is a prefix of a PEO
for $C^+$. □

The set $G(\alpha)$ in any triangulation $C^+$ of $C(G)$ simplifies the FLOP counting
in the reduction from QuadraticChainCompletion, as we will see now.

Theorem 3.14. QuadraticChainCompletion $\propto$ MinimumFLOPs.

Proof. As before, we continue to use the notation from Definition 3.9. By
Lemmas 3.11 and 3.13 every chain completion $F$ of $G$ gives a triangulation
$F' = F \cup G(\alpha)$ for $C(G)$ and vice versa. Further, the chain orderings correspond
to reversed prefixes of PEOs and vice versa. We show: There exists a chain
completion of cost at most $k$ if and only if we can triangulate $C(G)$ with a FLOP
count of at most $k' := k + p(p + 1)^2 + \sum_{i=1}^{p+q} i^2$.

If $F$ is a set of edges whose addition to $G$ yields a chain graph $G^+$ with chain
ordering $\alpha$ for $P$, then $\alpha^R$ starts a PEO for the corresponding triangulation of
$C(G)$. We will calculate the elimination degrees. At the $i$-th elimination step,
the vertex $\alpha^{-R}(i)$ is adjacent to $p - i$ vertices in $P$, $d_{G^+}(\alpha^{-R}(i))$ vertices in $Q$ and
$i$ vertices in $U$. So the $p$ elimination degrees associated with $\alpha^R$ are

\begin{align*}
d(\alpha^{-R}(i)) &= p - i + d_{G^+}(\alpha^{-R}(i)) + i \\
&= p + d_{G^+}(\alpha^{-R}(i)), \qquad 1 \le i \le p. \tag{11}
\end{align*}

After the elimination of these first $p$ vertices, a clique of size $p + q$ remains, so a
PEO $\alpha'$ for $C^+$ is obtained by completing $\alpha^R$ arbitrarily. For the FLOP count
we find:

\begin{align*}
\operatorname{flop}(\alpha') &= \sum_{i=1}^{p} \left(p + 1 + d_{G^+}(\alpha^{-R}(i))\right)^2 + \sum_{i=1}^{p+q} i^2 \\
&= \sum_{v \in P} d_{G^+}(v)^2 + 2(p + 1) \sum_{v \in P} d_{G^+}(v) + p(p + 1)^2 + \sum_{i=1}^{p+q} i^2 \\
&= \operatorname{qcc}(F) + p(p + 1)^2 + \sum_{i=1}^{p+q} i^2.
\end{align*}

Since the FLOP count does not depend on the particular PEO $\alpha'$ for $C^+$,
the FLOP count induced by the triangulation $F'$ is at most $k'$ if and only if
the quadratic chain completion cost of $F$ is at most $k$. □
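
The FLOP identity used in this proof can be replayed on a small instance by playing the elimination game on $C(G)$ with the fill $F \cup G(\alpha)$ added. The sketch below is illustrative only: it assumes that one elimination step with elimination degree $d$ contributes $(d + 1)^2$ operations, which is the convention implied by the displayed sum, and the vertex labels, function names and the three-vertex example are hypothetical.

```python
from itertools import combinations


def build_c_plus(alpha, Q, chain_edges):
    """Adjacency sets of C(G) with the fill F + G(alpha) added.

    alpha: list of P in chain order (alpha[0] has the largest neighbourhood).
    chain_edges: edges (v, w) of the chain graph G+, with v in P and w in Q.
    """
    U = [("u", v) for v in alpha]
    adj = {x: set() for x in list(alpha) + list(Q) + U}

    def add(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for a, b in combinations(alpha, 2):               # P x P
        add(a, b)
    for a, b in combinations(list(Q) + U, 2):         # (Q + U) x (Q + U)
        add(a, b)
    for v in alpha:                                   # (v, u_v)
        add(v, ("u", v))
    for v, w in chain_edges:                          # E + F
        add(v, w)
    for i, j in combinations(range(len(alpha)), 2):   # G(alpha), pairs with i < j
        add(alpha[i], ("u", alpha[j]))
    return adj, U


def eliminate(adj, order):
    """Elimination game; returns (number of fill edges, sum of (degree + 1)^2)."""
    adj = {v: set(nb) for v, nb in adj.items()}
    fill = flops = 0
    for v in order:
        nbrs = adj.pop(v)
        for x in nbrs:
            adj[x].discard(v)
        for a, b in combinations(nbrs, 2):
            if b not in adj[a]:
                adj[a].add(b)
                adj[b].add(a)
                fill += 1
        flops += (len(nbrs) + 1) ** 2
    return fill, flops


def qcc(alpha, chain_edges):
    """Quadratic chain completion cost with P designated."""
    p = len(alpha)
    deg = {v: 0 for v in alpha}
    for v, _ in chain_edges:
        deg[v] += 1
    return sum(d * d + 2 * (p + 1) * d for d in deg.values())


alpha, Q = ["a", "b", "c"], ["x", "y"]
chain_edges = [("a", "x"), ("a", "y"), ("b", "x")]
adj, U = build_c_plus(alpha, Q, chain_edges)
order = alpha[::-1] + Q + U          # reversed chain order first, then the remaining clique
fill, flops = eliminate(adj, order)
p, q = len(alpha), len(Q)
expected = qcc(alpha, chain_edges) + p * (p + 1) ** 2 + sum(i * i for i in range(1, p + q + 1))
print(fill, flops, expected)
```

In this example the reversed chain ordering eliminates without fill, and the simulated count agrees with the closed form.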

If we omitted the vertices $U$ from the construction of $C(G)$, the vertex
degrees (11) would depend on the position of the vertices in the ordering $\alpha$. The
implied quadratic cost function for the chain completion problem would make
the treatment that follows much more difficult.



3.2.2 Reduction from optimal quadratic arrangement 

In section 3.1 we have shown that OQA(c) is an NP hard problem for any choice
of the polynomial $c$ in (7). For the rest of the section we are interested only in
the special case OQA($2(X^2 + 1)$), from which we reduce to the QCC problem.
This polynomial is intentionally chosen to match up with the $2(p + 1)$ factor in
the formulation of the QCC problem.

The following construction for creating a bipartite graph $G' = (P \cup Q, E')$
from a given graph $G = (V, E)$ is used in [22, Lemma 3] (an example is provided
there). For a vertex $v$ define the set $R(v) := \{w_1^v, \dots, w_{l_v}^v \mid l_v = n - d_G(v)\}$,
then $G'$ is given by

\begin{align*}
P &= V, \qquad Q = \{w_1^e, w_2^e \mid e \in E\} \cup \bigcup_{v \in V} R(v) \quad \text{and} \\
E' &= \{(u, w_i^e) \mid e \in E,\ u \in V,\ e \in \delta(u),\ 1 \le i \le 2\} \tag{12} \\
&\quad \cup \{(v, w) \mid v \in V,\ w \in R(v)\}.
\end{align*}

The construction of G' is such that all inclusion minimal chain completions 
can be easily characterized from vertex orderings of G, as the next lemma shows. 

Lemma 3.15 (extracted from [22, Lemma 3]). Let $\alpha : V \to [n]$ be an ordering
for the vertices of $G = (V, E)$ and for $w \in Q$, define $\alpha(w) = \max\{i \mid
(w, \alpha^{-1}(i)) \in E'\}$. Then the set of edges

\[ H(\alpha) = \{(\alpha^{-1}(j), w) \mid w \in Q,\ j < \alpha(w)\} \setminus E' \tag{13} \]

is a set of edges whose addition to $G'$ yields a bipartite chain graph with chain
ordering $\alpha$ for $P$. Moreover, for any minimal set of edges $F$ such that $(P, Q, E' \cup
F)$ is a bipartite chain graph with $P$-ordering $\alpha$, we have $F = H(\alpha)$.
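
Construction (12) and the completion (13) are easy to build explicitly for small graphs. The sketch below is one possible encoding and not taken from the paper: vertices of $Q$ are represented by hypothetical tuples `("w", e, i)` and `("r", v, i)`, and the final helper checks the nesting condition of a chain ordering.

```python
from itertools import permutations


def build_g_prime(vertices, edges):
    """Construction (12): returns (Q, E') for P = vertices."""
    n = len(vertices)
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    Q, e_prime = [], set()
    for e in edges:
        for i in (1, 2):                       # two copies w_1^e, w_2^e per edge
            w = ("w", e, i)
            Q.append(w)
            e_prime.update({(e[0], w), (e[1], w)})
    for v in vertices:
        for i in range(1, n - deg[v] + 1):     # R(v) has n - d_G(v) private vertices
            w = ("r", v, i)
            Q.append(w)
            e_prime.add((v, w))
    return Q, e_prime


def completion_h(order, Q, e_prime):
    """H(alpha) from (13); order[j - 1] plays the role of alpha^{-1}(j)."""
    pos = {v: j for j, v in enumerate(order, start=1)}
    last = {w: 0 for w in Q}
    for v, w in e_prime:
        last[w] = max(last[w], pos[v])         # alpha(w)
    return {(v, w) for w in Q for v in order if pos[v] < last[w]} - e_prime


def is_chain_order(order, e_plus):
    """Check that neighbourhoods shrink along the given ordering of P."""
    nbr = {v: {w for (x, w) in e_plus if x == v} for v in order}
    return all(nbr[a] >= nbr[b] for a, b in zip(order, order[1:]))


V = ["a", "b", "c"]
E = [("a", "b"), ("b", "c")]
Q, Ep = build_g_prime(V, E)
H = completion_h(V, Q, Ep)
print(len(Q), is_chain_order(V, Ep | H))
```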

Theorem 3.16. Let $c = 2(X^2 + 1)$, then OQA(c) $\propto$ QCC.

Proof. Let $(G = (V, E); k)$ be an instance of OQA with $|V| = n$, $|E| = m$. Let
$G'$ be constructed as in (12). We define an instance for QCC by $(G'; k + p(n))$,
where $p(n) = \frac{1}{6} n^2 (n + 1)(2n + 3c(n) + 1)$, and regard $Q$ as the designated
partition for the decision problem. For the number of vertices in $Q$ we find

\[ |Q| = 2m + \sum_{v \in V} |R(v)| = 2m + \sum_{v \in V} (n - d_G(v)) = 2m + n^2 - 2m = n^2. \]



By Lemma 3.15, we only need to relate the quadratic ordering cost of an arbitrary
vertex ordering $\alpha : V \to [n]$ for $G$ to the quadratic chain completion
cost for $H(\alpha)$ for $G'$. Set $G'^+ = (P, Q, E' \cup H(\alpha))$ and assume for all edges
$e = (u, v) \in E$ that we have $\alpha(u) < \alpha(v)$. For every vertex $w_i^e \in Q$, we have
$d_{G'^+}(w_i^e) = \alpha(v)$. For any $v \in V$ we have $d_{G'^+}(w) = \alpha(v)$ for all vertices
$w \in R(v)$. We abbreviate $l_v := n - d_G(v)$ and find for the total quadratic chain
completion cost:

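\begin{align*}
\operatorname{qcc}(H(\alpha)) &= \sum_{w \in Q} \Big( d_{G'^+}(w)^2 + \underbrace{2(n^2 + 1)}_{= c(n)} \, d_{G'^+}(w) \Big) \\
&= 2 \sum_{(u,v) \in E} \left(\alpha(v)^2 + c(n)\alpha(v)\right) + \sum_{v \in V} \sum_{x \in R(v)} \left(\alpha(v)^2 + c(n)\alpha(v)\right) \\
&= 2 \sum_{(u,v) \in E} \left(\alpha(v)^2 + c(n)\alpha(v)\right) + \sum_{v \in V} (n - d_G(v))\left(\alpha(v)^2 + c(n)\alpha(v)\right) \\
&\quad + \sum_{(u,v) \in E} \left(\alpha(u)^2 + c(n)\alpha(u)\right) - \sum_{(u,v) \in E} \left(\alpha(u)^2 + c(n)\alpha(u)\right) \\
&= \sum_{(u,v) \in E} \left(\alpha(v)^2 - \alpha(u)^2 + c(n)(\alpha(v) - \alpha(u))\right) \\
&\quad + \sum_{(u,v) \in E} \left(\alpha(v)^2 + \alpha(u)^2 + c(n)(\alpha(u) + \alpha(v))\right) + \sum_{v \in V} l_v \left(\alpha(v)^2 + c(n)\alpha(v)\right) \\
&= q(\alpha) + \sum_{v \in V} d_G(v)\left(\alpha(v)^2 + c(n)\alpha(v)\right) + \sum_{v \in V} (n - d_G(v))\left(\alpha(v)^2 + c(n)\alpha(v)\right) \\
&= q(\alpha) + n \sum_{v \in V} \left(\alpha(v)^2 + c(n)\alpha(v)\right) = q(\alpha) + p(n).
\end{align*}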

This shows $q(\alpha) \le k \iff \operatorname{qcc}(H(\alpha)) \le k + p(n)$, which completes the proof. □
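
The identity $\operatorname{qcc}(H(\alpha)) = q(\alpha) + p(n)$ can be verified numerically for every ordering of a small graph. The sketch below does not build $G'$ explicitly; it uses the degree facts stated in the proof (each of the two copies for an edge gets the position of its later endpoint, each vertex of $R(v)$ gets $\alpha(v)$). The function names and the sample graph are illustrative.

```python
from itertools import permutations


def c_of(n):
    """c(n) = 2(n^2 + 1), matching the factor 2(|Q| + 1) with |Q| = n^2."""
    return 2 * (n * n + 1)


def q_cost(edges, pos, c):
    """Quadratic arrangement cost: each edge pays g(later) - g(earlier), g(x) = x^2 + c*x."""
    g = lambda x: x * x + c * x
    return sum(g(max(pos[u], pos[v])) - g(min(pos[u], pos[v])) for u, v in edges)


def qcc_of_h(vertices, edges, pos):
    """qcc(H(alpha)) with Q designated, computed from the degrees of the Q vertices."""
    n = len(vertices)
    c = c_of(n)
    deg = {v: 0 for v in vertices}
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    q_degrees = []
    for u, v in edges:                       # two copies per edge
        q_degrees += [max(pos[u], pos[v])] * 2
    for v in vertices:                       # n - d_G(v) private vertices of v
        q_degrees += [pos[v]] * (n - deg[v])
    return sum(d * d + c * d for d in q_degrees)


def p_of(n):
    """p(n) = n^2 (n + 1)(2n + 3c(n) + 1) / 6, an integer since it equals n * sum of g(i)."""
    return n * n * (n + 1) * (2 * n + 3 * c_of(n) + 1) // 6


V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
n = len(V)
identity_holds = all(
    qcc_of_h(V, E, dict(zip(perm, range(1, n + 1))))
    == q_cost(E, dict(zip(perm, range(1, n + 1))), c_of(n)) + p_of(n)
    for perm in permutations(V)
)
print(identity_holds)
```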

4 Conclusions and future work 

In this work we have shown by means of an explicit, scalable construction that 
minimum fill and minimum operation count for the sparse Cholesky factorization 
are not achievable simultaneously in general. We proved that minimizing the 
number of arithmetic operations is just as difficult as minimizing the fill in: it is 
NP hard. While this result is not surprising, no proof has been given so far, and 
thus our findings close a gap in the theoretical body of sparse direct methods. 

It would be of interest to understand how well optimal fill orderings approx- 
imate the optimal number of arithmetic operations (and vice versa) . Approxi- 
mation bounds based on general equivalence constants for the 1- and 2-norm or 
bounds based on full $k$-tree embeddings (e.g. [20, prop. 3]) are too coarse to
offer quantitative insight into this question.

Figure 7: A graph where minimum quadratic and linear arrangement costs are
attained on distinct orderings.



A OQA(c) and OLA are different 

We show that OLA and OQA are different problems in the sense that opti- 
mizing the linear arrangement cost does not necessarily optimize the quadratic 
arrangement cost (and vice versa). Let $n \ge 4$, $C$ a set of size $n$, $u, v \in C$ two
distinct elements and consider the following class of graphs $G = (V, E)$ (see Fig.
7):

\[ V = C \cup \{x, y\}, \qquad E = C \times C \cup \{(x, u), (y, v)\}. \]

It is easy to see that any linear or quadratic arrangement where x or y 
intersperses with the vertices of C is suboptimal. If x, y are ordered before 
or after C, any ordering that does not place u,v as close as possible to x,y 
is suboptimal, too. Ruling out those suboptimal orderings, only five different 
orderings (modulo cost-neutral rearrangements of C) remain; they are displayed 
in Fig. 7 on the right. We calculate the linear arrangement costs: 

\[ l(\alpha_1) = \frac{1}{6} n(n^2 - 1) + 2, \qquad l(\alpha_{2/3}) = l(\alpha_{4/5}) = \frac{1}{6} n(n^2 - 1) + 4, \]

so $\alpha_1$ is an optimal linear arrangement while the others are not. Using Lemma
3.4 we find the quadratic arrangement costs

\begin{align*}
q(\alpha_1) &= \frac{1}{6} n(n^2 - 1)(c + n + 3) + 2(n + c + 3), \\
q(\alpha_{2/3}) &= \frac{1}{6} n(n^2 - 1)(c + n + 5) + 4c + 20, \\
q(\alpha_{4/5}) &= \frac{1}{6} n(n^2 - 1)(c + n + 1) + 8n + 4c + 4.
\end{align*}

It is easy to see that $q(\alpha_{4/5})$ is strictly less than the other costs for sufficiently
large $n$ (recall that $c$ is fixed). Thus OQA and OLA are different problems for
every polynomial $c$.
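
The comparison can be reproduced for concrete values. The sketch below evaluates both costs for one ordering of type $\alpha_1$ (x before $C$, y after $C$) and one of type $\alpha_{4/5}$ (both after $C$). It assumes $c \ge 0$ so that $g(x) = x^2 + cx$ is increasing on the positions; the vertex labels and the choice $u = 0$, $v = 1$ are illustrative.

```python
from itertools import combinations


def arrangement_costs(order, edges, c):
    """Return (linear cost, quadratic cost) of an ordering given as a list of vertices."""
    pos = {v: i for i, v in enumerate(order, start=1)}
    g = lambda x: x * x + c * x
    linear = sum(abs(pos[a] - pos[b]) for a, b in edges)
    quadratic = sum(abs(g(pos[a]) - g(pos[b])) for a, b in edges)
    return linear, quadratic


def example(n, c):
    """Costs of an alpha_1-type and an alpha_{4/5}-type ordering for K_n plus two pendants."""
    u, v, rest = 0, 1, list(range(2, n))
    edges = list(combinations(range(n), 2)) + [("x", u), ("y", v)]
    alpha_1 = ["x", u] + rest + [v, "y"]   # x directly before u, y directly after v
    alpha_4 = rest + [v, u, "x", "y"]      # C first, then x and y
    return arrangement_costs(alpha_1, edges, c), arrangement_costs(alpha_4, edges, c)


(lin1, quad1), (lin4, quad4) = example(5, 0)
print(lin1 < lin4, quad4 < quad1)
```

For $n = 5$ and $c = 0$ the first ordering has the smaller linear cost while the second has the smaller quadratic cost; for $n = 4$ and $c = 0$ the quadratic comparison still goes the other way, which is consistent with the statement holding only for sufficiently large $n$.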

Figure 8: Moving an isolated vertex to the left into the largest cut may decrease
the total arrangement cost. The numbers shown next to the edges are the edge
costs; the arrangement on the left has a total cost of 102 while the arrangement
on the right has cost 101. Here the distance function is $f(x) = x^2$.



B Moving isolated vertices to the left 

In the reduction from MaxCut to OQA(c), we needed to rearrange isolated 
vertices within a given ordering without decreasing the costs. Fig. 8 shows an 
arrangement of a graph on 8 vertices, of which one is isolated. If the isolated 
vertex is moved to the left so that it intersperses with the largest cut, the 
arrangement cost decreases. This is why we need to resort to rearranging blocks 
of isolated vertices. 



